The heatmap of the word matrix for SARS-CoV and SARS-CoV-2 genome
data (3-mer word frequency matrix).
(a) The PCA map of the 3-mer word matrix for the SARS-CoV and SARS-CoV-2
sequences. (b) The ROC curves of the k-mer machine for discriminating between
SARS-CoV and SARS-CoV-2 genome sequences.
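The 3-mer word frequency matrix named in the captions above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmer_vocabulary`, `kmer_frequencies`, `word_matrix`), the use of overlapping windows, the skipping of windows with non-ACGT characters, and the normalization to relative frequencies are all assumptions made for illustration, and the two input strings are toy sequences rather than real genome data.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed DNA alphabet; ambiguous bases such as N are ignored


def kmer_vocabulary(k=3):
    """Return all 4**k words over the alphabet, in lexicographic order."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def kmer_frequencies(sequence, k=3):
    """Relative frequency of each k-mer in one sequence.

    Overlapping windows are counted; windows containing characters
    outside ACGT are skipped (an assumption of this sketch).
    """
    vocab = kmer_vocabulary(k)
    counts = dict.fromkeys(vocab, 0)
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in vocab]


def word_matrix(sequences, k=3):
    """One row per sequence, one column per k-mer (64 columns for k=3)."""
    return [kmer_frequencies(s, k) for s in sequences]


# Toy sequences for illustration only, not real SARS-CoV or SARS-CoV-2 data.
toy_sequences = ["ATGATGCC", "ATGNNTTT"]
matrix = word_matrix(toy_sequences)
```

Each row of `matrix` is the feature vector of one sequence. Rows like these are what a heatmap, a PCA projection, or a downstream classifier would take as input.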
Figure 7.17(a) presents the PCA of the word matrix. Again, it
shows perfect discrimination power between the two groups of SARS
sequences. The discrimination power of the k-mer word
frequency library has already been studied in the literature [Ghandi,
2016; Fletez-Brant, et al., 2016; Lee, 2016; Beer, 2017; Shrikumar,
2019]. Such a classifier is referred to as a k-mer machine. Three
classification algorithms were used, hence three k-mer machine models